PlantTFDB
Plant Transcription Factor Database
v4.0
Previous version: v3.0
Transcription Factor Information
Basic Information
TF ID Thecc1EG015114t1
Common Name TCM_015114
Organism
Taxonomic ID
Taxonomic Lineage
cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Malvales; Malvaceae; Byttnerioideae; Theobroma
Family HD-ZIP
Protein Properties Length: 795aa    MW: 87174.6 Da    PI: 6.372
Description HD-ZIP family protein
Gene Model
Gene Model ID    Type    Source    Coding Sequence
Thecc1EG015114t1    genome    CGD    View CDS
Signature Domain
Signature Domain
No.    Domain    Score    E-value    Start    End    HMM Start    HMM End
1    Homeobox    64    2.2e-20    101    156    1    56
                       TT--SS--HHHHHHHHHHHHHSSS--HHHHHHHHHHCTS-HHHHHHHHHHHHHHHH CS
          Homeobox   1 rrkRttftkeqleeLeelFeknrypsaeereeLAkklgLterqVkvWFqNrRakek 56 
                       ++k +++t++q++eLe++F+++++p++++r eL+++l L+ +q+k+WFqNrR+++k
  Thecc1EG015114t1 101 KKKYHRHTPHQIQELESFFKECPHPDEKQRLELSRRLALESKQIKFWFQNRRTQMK 156
                       79999************************************************999 PP

2    START    173.3    1.5e-54    302    524    2    205
                       HHHHHHHHHHHHHHHC-TT-EEEE....EXCCTTEEEEEEESSS......SCEEEEEEEECCSCHHHHHHHHHCCCGGCT-TT-S....EEEE CS
             START   2 laeeaaqelvkkalaeepgWvkss....esengdevlqkfeeskv.....dsgealrasgvvdmvlallveellddkeqWdetla....kaet 81 
                       +a++a++el+k+++ ++p+W k      e +n +e++++f++  +     + +ea r++g+v+     lve+l+d + +W e+++    +++t
  Thecc1EG015114t1 302 IALAAMDELIKMVQMDSPLWIKGLdggmETLNHEEYRRTFSSCIGmkpsgYATEATRETGLVFLRGLALVETLMDAN-RWAEMFPcmisRVAT 393
                       6899************************************98888********************************.*************** PP

                       EEEECTT......EEEEEEEEXXTTXX-SSX.EEEEEEEEEEE.TTS-EEEEEEEEE-TTS--.-TTSEE-EESSEEEEEEEECTCEEEEEEE CS
             START  82 levissg......galqlmvaelqalsplvp.RdfvfvRyirqlgagdwvivdvSvdseqkppesssvvRaellpSgiliepksnghskvtwv 167
                       ++v+ss+      ++lq+m ae+q+lsplvp R + f+R+++q+++ +w++vdvS+d  q+  + + +  +++lpSg++i++++n +skvtwv
  Thecc1EG015114t1 394 IDVLSSAtgvtrdNTLQVMDAEFQVLSPLVPvRQVRFLRFCKQHTERVWAVVDVSIDASQDAASAQMFPNCRRLPSGCVIQDMDNKYSKVTWV 486
                       **************************************************************9888899************************ PP

                       E-EE--SSXXHHHHHHHHHHHHHHHHHHHHHHTXXXXX CS
             START 168 ehvdlkgrlphwllrslvksglaegaktwvatlqrqce 205
                       eh +++++ +h llr+l+++g  +ga +w+atlqrqc 
  Thecc1EG015114t1 487 EHSEYDDSAVHHLLRPLLSYGFGFGAHRWLATLQRQCD 524
                       ************************************96 PP

Protein Features
3D Structure
Database    Entry ID    E-value    Start    End    InterPro ID    Description
SuperFamily    SSF46689    6.27E-20    87    158    IPR009057    Homeodomain-like
Gene3D    G3DSA:1.10.10.60    2.2E-21    87    158    IPR009057    Homeodomain-like
PROSITE profile    PS50071    17.508    98    158    IPR001356    Homeobox domain
SMART    SM00389    6.2E-17    100    162    IPR001356    Homeobox domain
Pfam    PF00046    5.0E-18    101    156    IPR001356    Homeobox domain
CDD    cd00086    2.42E-18    101    158    No hit    No description
PROSITE pattern    PS00027    0    133    156    IPR017970    Homeobox, conserved site
PROSITE profile    PS50848    37.517    292    528    IPR002913    START domain
SuperFamily    SSF55961    9.75E-30    293    525    No hit    No description
CDD    cd08875    9.99E-110    296    523    No hit    No description
SMART    SM00234    1.2E-34    301    525    IPR002913    START domain
Pfam    PF01852    2.3E-46    302    524    IPR002913    START domain
SuperFamily    SSF55961    3.71E-18    553    788    No hit    No description
Gene Ontology
GO Term    GO Category    GO Description
GO:0006355    Biological Process    regulation of transcription, DNA-templated
GO:0005634    Cellular Component    nucleus
GO:0008289    Molecular Function    lipid binding
GO:0043565    Molecular Function    sequence-specific DNA binding
Sequence
Protein Sequence    Length: 795 aa
MGARIVVADI VPPSNMLSGA IVEPPLLTQH IPKSMQSSPS LSLSYKRMDA HGEMGLIGEN  60
FDPGLVGRMK EDGYESRSGS DNFEGASGDD QDAADDGRPK KKKYHRHTPH QIQELESFFK  120
ECPHPDEKQR LELSRRLALE SKQIKFWFQN RRTQMKTQLE RHENVILRQE NDKLRAENDL  180
LKQAMSSPTC NSCGGPAVPG EISYEQHQLR IENARLKDEL NRICALTNKF LGRPLSSSAS  240
PIPSQGLNSN LELAVGRNDF GGLNNAGTTL PMGFDFVDGA MMPLMKTMAN EMPYDRSALV  300
DIALAAMDEL IKMVQMDSPL WIKGLDGGME TLNHEEYRRT FSSCIGMKPS GYATEATRET  360
GLVFLRGLAL VETLMDANRW AEMFPCMISR VATIDVLSSA TGVTRDNTLQ VMDAEFQVLS  420
PLVPVRQVRF LRFCKQHTER VWAVVDVSID ASQDAASAQM FPNCRRLPSG CVIQDMDNKY  480
SKVTWVEHSE YDDSAVHHLL RPLLSYGFGF GAHRWLATLQ RQCDCLAVLM SPNIPGEENT  540
GITPAGRKNM LKLAQRMTYN FCAGVCASSV HKWDKLSVGN VGEDVRVMTR KNIDDPGEPA  600
GVVLSAATSV WMPITQQRLF DFLRDERMRS QWDILSNGGP MQGMVKIAKG PGHGNCVSLL  660
RGSAINANEN NMLILQETWS DASGALVVYA PVDISSIGVV MNGGDSAYVA LLPSGFAILP  720
GISPSYHGGQ SNSNGPMVKP DIDGSISGGC LLTVGFQILV NSLPTAKLTV ESVETVNNLI  780
SCTIQKIKAA LTVT*
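
Note: the domain coordinates listed under Signature Domain are 1-based positions in this protein sequence, so they can be checked directly against the sequence block. The following is a minimal Python sketch of that check, not part of PlantTFDB itself. The helper names `parse_sequence_block` and `domain_slice` are hypothetical, and only the first three rows of the block (residues 1-180) are embedded to keep the example short.

```python
import re

# First three rows of the sequence block above (residues 1-180), copied verbatim.
FIRST_ROWS = """
MGARIVVADI VPPSNMLSGA IVEPPLLTQH IPKSMQSSPS LSLSYKRMDA HGEMGLIGEN  60
FDPGLVGRMK EDGYESRSGS DNFEGASGDD QDAADDGRPK KKKYHRHTPH QIQELESFFK  120
ECPHPDEKQR LELSRRLALE SKQIKFWFQN RRTQMKTQLE RHENVILRQE NDKLRAENDL  180
"""


def parse_sequence_block(block: str) -> str:
    """Join a numbered, space-grouped sequence block into one residue string.

    Hypothetical helper: assumes each row may end with a running residue
    count and that '*' marks the end of the translation.
    """
    parts = []
    for line in block.strip().splitlines():
        line = re.sub(r"\s+\d+\s*$", "", line)     # drop the trailing running count
        parts.append(re.sub(r"[\s*]", "", line))   # drop group spacing and '*'
    return "".join(parts)


def domain_slice(seq: str, start: int, end: int) -> str:
    """Return residues start..end, treating both coordinates as 1-based and inclusive."""
    return seq[start - 1:end]


seq = parse_sequence_block(FIRST_ROWS)
homeobox = domain_slice(seq, 101, 156)  # Homeobox row: Start 101, End 156
print(len(seq), homeobox)
```

The extracted segment can be compared with the Thecc1EG015114t1 line of the Homeobox alignment; the same slicing applies to the START domain (302 to 524) once the full sequence is parsed.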
Regulation -- PlantRegMap
Source    Upstream Regulator    Target Gene
PlantRegMap    Retrieve    -
Annotation -- Protein
Source    Hit ID    E-value    Description
Refseq    XP_007038621.1    0.0    Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 1
Swissprot    Q0WV12    0.0    ANL2_ARATH; Homeobox-leucine zipper protein ANTHOCYANINLESS 2
TrEMBL    A0A061G1P9    0.0    A0A061G1P9_THECC; Homeobox-leucine zipper family protein / lipid-binding START domain-containing protein isoform 1
STRING    GLYMA10G38280.2    0.0    (Glycine max)
Orthologous Group
Lineage    Orthologous Group ID    Taxa Number    Gene Number
Malvids    OGEM112827105
Best hit in Arabidopsis thaliana
Hit ID    E-value    Description
AT4G00730.1    0.0    HD-ZIP family protein
Publications
  1. Motamayor JC, et al.
    The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color.
    Genome Biol., 2013. 14(6): p. r53
    [PMID:23731509]